An Optimal Algorithm for Bandit and Zero-Order Convex Optimization with Two-Point Feedback

Author

  • Ohad Shamir
Abstract

We consider the closely related problems of bandit convex optimization with two-point feedback, and zero-order stochastic convex optimization with two function evaluations per round. We provide a simple algorithm and analysis which is optimal for convex Lipschitz functions. This improves on Duchi et al. (2015), which only provides an optimal result for smooth functions. Moreover, the algorithm and analysis are simpler, and readily extend to non-Euclidean problems. The algorithm is based on a small but surprisingly powerful modification of the gradient estimator.
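To make the setting concrete, here is a minimal Python sketch of zero-order optimization with two function evaluations per round. The abstract does not spell out the estimator, so the symmetric form used here (query f at x + δu and x − δu for a random unit direction u, then scale the difference by d/(2δ)) is an assumption about what such an estimator can look like, not a transcription of the paper's algorithm. The names `two_point_gradient_estimate`, `zero_order_sgd`, and the test objective `l1_distance` are hypothetical, as are the step-size and δ choices.

```python
import math
import random


def random_unit_vector(dim, rng):
    """Draw a direction uniformly from the unit sphere (normalized Gaussian)."""
    while True:
        g = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(v * v for v in g))
        if norm > 0.0:
            return [v / norm for v in g]


def two_point_gradient_estimate(f, x, delta, rng):
    """Estimate a gradient of f at x using exactly two evaluations of f.

    Assumed form: (d / (2 * delta)) * (f(x + delta*u) - f(x - delta*u)) * u.
    """
    dim = len(x)
    u = random_unit_vector(dim, rng)
    x_plus = [xi + delta * ui for xi, ui in zip(x, u)]
    x_minus = [xi - delta * ui for xi, ui in zip(x, u)]
    scale = dim * (f(x_plus) - f(x_minus)) / (2.0 * delta)
    return [scale * ui for ui in u]


def zero_order_sgd(f, x0, steps, step_size, delta, seed=0):
    """Unconstrained gradient descent driven only by function values.

    Returns the running average of the iterates. The 1/sqrt(t) step-size
    schedule is an illustrative choice, not taken from the paper.
    """
    rng = random.Random(seed)
    x = list(x0)
    avg = [0.0] * len(x)
    for t in range(1, steps + 1):
        g = two_point_gradient_estimate(f, x, delta, rng)
        eta = step_size / math.sqrt(t)
        x = [xi - eta * gi for xi, gi in zip(x, g)]
        avg = [a + (xi - a) / t for a, xi in zip(avg, x)]
    return avg


def l1_distance(x):
    """Hypothetical test objective: convex and Lipschitz, but not smooth."""
    target = [1.0, -2.0, 0.5]
    return sum(abs(xi - ti) for xi, ti in zip(x, target))


if __name__ == "__main__":
    x_avg = zero_order_sgd(l1_distance, [0.0, 0.0, 0.0],
                           steps=2000, step_size=0.5, delta=1e-3)
    print("averaged iterate:", x_avg)
    print("objective value:", l1_distance(x_avg))
```

The non-smooth objective is chosen deliberately, since the abstract's claim concerns convex Lipschitz functions rather than smooth ones. For a linear function the estimate above averages out to the true gradient, which is a quick sanity check on the d/(2δ) scaling.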


Related resources

Stochastic convex optimization with bandit feedback

This paper addresses the problem of minimizing a convex, Lipschitz function f over a convex, compact set X under a stochastic bandit feedback model. In this model, the algorithm is allowed to observe noisy realizations of the function value f(x) at any query point x ∈ X. The quantity of interest is the regret of the algorithm, which is the sum of the function values at algorithm’s query points...


Regret Analysis for Continuous Dueling Bandit

The dueling bandit is a learning framework wherein the feedback information in the learning process is restricted to a noisy comparison between a pair of actions. In this research, we address a dueling bandit problem based on a cost function over a continuous space. We propose a stochastic mirror descent algorithm and show that the algorithm achieves an O(√(T log T))-regret bound under strong ...


An optimal algorithm for bandit convex optimization

We consider the problem of online convex optimization against an arbitrary adversary with bandit feedback, known as bandit convex optimization. We give the first Õ(√T)-regret algorithm for this setting based on a novel application of the ellipsoid method to online learning. This bound is known to be tight up to logarithmic factors. Our analysis introduces new tools in discrete convex geometry.


Particle Swarm Optimization with Smart Inertia Factor for Combined Heat and Power Economic Dispatch

In this paper particle swarm optimization with smart inertia factor (PSO-SIF) algorithm is proposed to solve combined heat and power economic dispatch (CHPED) problem. The CHPED problem is one of the most important problems in power systems and is a challenging non-convex and non-linear optimization problem. The aim of solving CHPED problem is to determine optimal heat and power of generating u...


Optimal Algorithms for Online Convex Optimization with Multi-Point Bandit Feedback

Bandit convex optimization is a special case of online convex optimization with partial information. In this setting, a player attempts to minimize a sequence of adversarially generated convex loss functions, while only observing the value of each function at a single point. In some cases, the minimax regret of these problems is known to be strictly worse than the minimax regret in the correspo...



Journal title:
  • Journal of Machine Learning Research

Volume 18  Issue 

Pages  -

Publication date 2017